Qualcomm AI Engine Direct - Fix UT example script hang when exception happened #4355
winskuo-quic wants to merge 2 commits into pytorch:main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/4355
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure)
As of commit b41626f with merge base 5a20a49: FLAKY - The following job failed but was likely due to flakiness present on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Hi @cccclai,
@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Force-pushed d17d18a to b41626f (Compare)
Hi @cccclai,
It seems like there are some failures in CI, so I have force-pushed another commit to trigger CI.
Please have a look.
@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
quantizer = QnnQuantizer()
quantizer.add_custom_quant_annotations(custom_annotations)
quantizer.set_per_channel_linear_quant(per_channel_linear)
quantizer.set_per_channel_conv_quant(True)
Is it always true or configurable?
I think we always set it to true, as a couple of models have bad accuracy when per-channel conv quantization is turned off.
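To illustrate the discussion above, here is a minimal, self-contained sketch of the toggle pattern behind `set_per_channel_linear_quant` / `set_per_channel_conv_quant`. The `SketchQuantizer` class and the op-name strings are hypothetical stand-ins, not the real `QnnQuantizer` implementation; only the method names mirror the diff.

```python
# Hypothetical sketch of the per-channel toggle pattern seen in this diff.
# The real QnnQuantizer lives in executorch/backends/qualcomm/quantizer/;
# class name and op strings here are illustrative assumptions.
class SketchQuantizer:
    def __init__(self):
        # ops whose weights should be quantized per-channel
        self.use_per_channel_weight_quant_ops = set()

    def _update_per_channel_weight_quant_ops(self, ops, enable):
        # Add or remove ops from the per-channel set depending on the flag,
        # mirroring the helper referenced in the diff hunk below.
        if enable:
            self.use_per_channel_weight_quant_ops.update(ops)
        else:
            self.use_per_channel_weight_quant_ops.difference_update(ops)

    def set_per_channel_linear_quant(self, enable: bool):
        linear_ops = {"aten.linear.default"}  # assumed op name
        self._update_per_channel_weight_quant_ops(linear_ops, enable)

    def set_per_channel_conv_quant(self, enable: bool):
        conv_ops = {"aten.conv1d.default", "aten.conv2d.default"}  # assumed
        self._update_per_channel_weight_quant_ops(conv_ops, enable)


quantizer = SketchQuantizer()
quantizer.set_per_channel_linear_quant(False)
quantizer.set_per_channel_conv_quant(True)  # always True, per the reply above
print(sorted(quantizer.use_per_channel_weight_quant_ops))
```

The point of the design: callers flip coarse per-op-family switches, while a single private helper keeps the underlying op set consistent.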
self._update_per_channel_weight_quant_ops(linear_ops, enable)

def transform_for_annotation(self, model: GraphModule) -> GraphModule:
    model = RemoveRedundancy()(model).graph_module
Should we have a follow-up to add this back? It feels like we may have a perf regression without it.
Thanks for reviewing. Although the RemoveRedundancy() pass is removed from the quantizer, it will still be called during capture_program(), so the final performance should be the same.
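The reasoning above can be sketched with a toy pipeline: dropping a cleanup pass from the quantizer stage does not change the final graph as long as a later lowering stage still runs it. All function names below are illustrative stand-ins for the real `RemoveRedundancy` pass and `capture_program`, not the actual ExecuTorch implementations.

```python
# Toy sketch (assumed names) of why removing the pass from the quantizer
# stage is safe: the lowering stage still applies it.
def remove_redundancy(graph):
    # Stand-in for the real RemoveRedundancy pass: drop no-op nodes.
    return [op for op in graph if op != "noop"]

def transform_for_annotation(graph):
    # After this PR: the redundancy pass no longer runs here.
    return graph

def capture_program(graph):
    # RemoveRedundancy still runs during lowering, so the final
    # graph matches what the old two-pass flow produced.
    return remove_redundancy(graph)


graph = ["conv", "noop", "linear"]
final = capture_program(transform_for_annotation(graph))
print(final)  # → ['conv', 'linear']
```

Since the pass is idempotent in this model, running it once in `capture_program` yields the same result as running it in both stages.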
executorch/backends/qualcomm/utils/utils.py
Line 196 in 7f6a341
Summary: